Dataset statistics
| Number of variables | 10 |
|---|---|
| Number of observations | 674545 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 51.5 MiB |
| Average record size in memory | 80.0 B |
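The memory figures above are mutually consistent if every column stores one 8-byte value per row. A minimal check of that arithmetic follows; the 8-byte cell width is an assumption (the report does not state it), and MiB is taken as 2^20 bytes.

```python
# Figures copied from the Dataset statistics table.
n_rows = 674545
n_cols = 10

# Assumption: each cell occupies 8 bytes (for example int64, float64,
# datetime64, or an object pointer). The report does not state this.
bytes_per_cell = 8

record_bytes = n_cols * bytes_per_cell
column_mib = n_rows * bytes_per_cell / 2**20
total_mib = n_rows * record_bytes / 2**20

print(f"record size : {record_bytes} B")
print(f"per column  : {column_mib:.1f} MiB")
print(f"whole frame : {total_mib:.1f} MiB")
```

Under that assumption the results agree with the 80.0 B average record size, the 51.5 MiB total, and the 5.1 MiB memory size listed for each variable.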
Variable types
| DateTime | 1 |
|---|---|
| Categorical | 1 |
| Numeric | 8 |
Alerts
| Alert | Type |
|---|---|
| Temperatura is highly correlated with Hora and 1 other field | High correlation |
| Temperatura_Aparente is highly correlated with Hora and 2 other fields | High correlation |
| Radiacion_Solar is highly correlated with Hora and 1 other field | High correlation |
| Hora is highly correlated with Temperatura and 3 other fields | High correlation |
| Humedad is highly correlated with Hora | High correlation |
| Precipitacion is highly skewed (γ1 = 70.31982767) | Skewed |
| Zona_Carga is uniformly distributed | Uniform |
| Hora has 28109 (4.2%) zeros | Zeros |
| Precipitacion has 639969 (94.9%) zeros | Zeros |
| Velocidad_Viento has 12368 (1.8%) zeros | Zeros |
| Radiacion_Solar has 334764 (49.6%) zeros | Zeros |
| Nubosidad has 524204 (77.7%) zeros | Zeros |
Reproduction
| Analysis started | 2022-11-07 03:44:38.409822 |
|---|---|
| Analysis finished | 2022-11-07 03:44:57.843755 |
| Duration | 19.43 seconds |
| Software version | pandas-profiling v3.3.0 |
| Download configuration | config.json |
Fecha
Date
| Distinct | 4686 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.1 MiB |
| Minimum | 2010-01-01 00:00:00 |
|---|---|
| Maximum | 2022-10-30 00:00:00 |
Histogram with fixed size bins (bins=50)
Zona_Carga
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.1 MiB |
Bar chart of category frequencies: Obregon, Guaymas, Caborca, Nogales, Navojoa
Length
| Max length | 10 |
|---|---|
| Median length | 7 |
| Mean length | 7.499914757 |
| Min length | 7 |
Characters and Unicode
| Total characters | 5059030 |
|---|---|
| Distinct characters | 21 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Obregon |
|---|---|
| 2nd row | Obregon |
| 3rd row | Obregon |
| 4th row | Obregon |
| 5th row | Obregon |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Obregon | 112428 | 16.7% |
| Guaymas | 112428 | 16.7% |
| Caborca | 112428 | 16.7% |
| Nogales | 112428 | 16.7% |
| Navojoa | 112428 | 16.7% |
| Hermosillo | 112405 | 16.7% |
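The counts above also account for the length and character statistics of this variable. A short sketch that recomputes them from the table (only the counts come from the report; the variable names are illustrative):

```python
# Counts copied from the Common Values table.
counts = {
    "Obregon": 112428,
    "Guaymas": 112428,
    "Caborca": 112428,
    "Nogales": 112428,
    "Navojoa": 112428,
    "Hermosillo": 112405,
}

n_rows = sum(counts.values())

# Share of each category, in percent.
share = {name: 100 * c / n_rows for name, c in counts.items()}

# Total characters and mean string length, weighted by count.
total_chars = sum(len(name) * c for name, c in counts.items())
mean_length = total_chars / n_rows

for name, pct in share.items():
    print(f"{name:<10} {pct:.1f}%")
print(n_rows, total_chars, mean_length)
```

The totals reproduce the 674545 observations and 5059030 total characters reported for this variable, and the weighted mean matches the reported mean length of about 7.4999.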
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
|---|---|---|
| obregon | 112428 | 16.7% |
| guaymas | 112428 | 16.7% |
| caborca | 112428 | 16.7% |
| nogales | 112428 | 16.7% |
| navojoa | 112428 | 16.7% |
| hermosillo | 112405 | 16.7% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| a | 786996 | 15.6% |
| o | 786950 | 15.6% |
| r | 337261 | 6.7% |
| e | 337261 | 6.7% |
| s | 337261 | 6.7% |
| l | 337238 | 6.7% |
| g | 224856 | 4.4% |
| b | 224856 | 4.4% |
| N | 224856 | 4.4% |
| m | 224833 | 4.4% |
| Other values (11) | 1236662 | 24.4% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Lowercase Letter | 4384485 | 86.7% |
| Uppercase Letter | 674545 | 13.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| a | 786996 | 17.9% |
| o | 786950 | 17.9% |
| r | 337261 | 7.7% |
| e | 337261 | 7.7% |
| s | 337261 | 7.7% |
| l | 337238 | 7.7% |
| g | 224856 | 5.1% |
| b | 224856 | 5.1% |
| m | 224833 | 5.1% |
| c | 112428 | 2.6% |
| Other values (6) | 674545 | 15.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| N | 224856 | 33.3% |
| O | 112428 | 16.7% |
| C | 112428 | 16.7% |
| G | 112428 | 16.7% |
| H | 112405 | 16.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 5059030 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
|---|---|---|
| a | 786996 | 15.6% |
| o | 786950 | 15.6% |
| r | 337261 | 6.7% |
| e | 337261 | 6.7% |
| s | 337261 | 6.7% |
| l | 337238 | 6.7% |
| g | 224856 | 4.4% |
| b | 224856 | 4.4% |
| N | 224856 | 4.4% |
| m | 224833 | 4.4% |
| Other values (11) | 1236662 | 24.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 5059030 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| a | 786996 | 15.6% |
| o | 786950 | 15.6% |
| r | 337261 | 6.7% |
| e | 337261 | 6.7% |
| s | 337261 | 6.7% |
| l | 337238 | 6.7% |
| g | 224856 | 4.4% |
| b | 224856 | 4.4% |
| N | 224856 | 4.4% |
| m | 224833 | 4.4% |
| Other values (11) | 1236662 | 24.4% |
Hora
| Distinct | 24 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11.50121786 |
| Minimum | 0 |
|---|---|
| Maximum | 23 |
| Zeros | 28109 |
| Zeros (%) | 4.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 6 |
| median | 12 |
| Q3 | 18 |
| 95-th percentile | 22 |
| Maximum | 23 |
| Range | 23 |
| Interquartile range (IQR) | 12 |
Descriptive statistics
| Standard deviation | 6.9219012 |
|---|---|
| Coefficient of variation (CV) | 0.6018407168 |
| Kurtosis | -1.204014689 |
| Mean | 11.50121786 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | -0.0001813032955 |
| Sum | 7758089 |
| Variance | 47.91271622 |
| Monotonicity | Not monotonic |
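Because the 24 hours occur almost equally often, these statistics sit close to those of an ideal uniform distribution over 0 to 23. A minimal sketch of that comparison follows; the idealized `hours` list is an assumption standing in for the real column, whose hour counts are slightly uneven.

```python
import statistics

# Assumption: every hour 0..23 appears equally often.
hours = list(range(24))

mean = statistics.fmean(hours)
variance = statistics.pvariance(hours)  # population variance
std = variance ** 0.5
cv = std / mean  # coefficient of variation

# Excess kurtosis from the fourth central moment.
m4 = statistics.fmean([(h - mean) ** 4 for h in hours])
kurtosis = m4 / variance**2 - 3

print(f"mean={mean:.4f} variance={variance:.4f} std={std:.4f}")
print(f"cv={cv:.4f} kurtosis={kurtosis:.4f}")
```

The idealized values land near the reported ones (mean 11.50, variance 47.91, CV 0.602, kurtosis -1.204). The small differences are expected, since the real counts per hour are not exactly equal and the profiling tool may use a different estimator.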
Histogram with fixed size bins (bins=24)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 23 | 28116 | 4.2% |
| 1 | 28109 | 4.2% |
| 22 | 28109 | 4.2% |
| 21 | 28109 | 4.2% |
| 20 | 28109 | 4.2% |
| 19 | 28109 | 4.2% |
| 18 | 28109 | 4.2% |
| 17 | 28109 | 4.2% |
| 16 | 28109 | 4.2% |
| 15 | 28109 | 4.2% |
| Other values (14) | 393448 | 58.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 28109 | 4.2% |
| 1 | 28109 | 4.2% |
| 2 | 28031 | 4.2% |
| 3 | 28109 | 4.2% |
| 4 | 28109 | 4.2% |
| 5 | 28109 | 4.2% |
| 6 | 28109 | 4.2% |
| 7 | 28109 | 4.2% |
| 8 | 28109 | 4.2% |
| 9 | 28109 | 4.2% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 23 | 28116 | 4.2% |
| 22 | 28109 | 4.2% |
| 21 | 28109 | 4.2% |
| 20 | 28109 | 4.2% |
| 19 | 28109 | 4.2% |
| 18 | 28109 | 4.2% |
| 17 | 28109 | 4.2% |
| 16 | 28109 | 4.2% |
| 15 | 28109 | 4.2% |
| 14 | 28109 | 4.2% |
Temperatura
| Distinct | 543 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23.60702918 |
| Minimum | -9.4 |
|---|---|
| Maximum | 47 |
| Zeros | 45 |
| Zeros (%) | < 0.1% |
| Negative | 1405 |
| Negative (%) | 0.2% |
| Memory size | 5.1 MiB |
Quantile statistics
| Minimum | -9.4 |
|---|---|
| 5-th percentile | 9.5 |
| Q1 | 17.8 |
| median | 24.3 |
| Q3 | 29.7 |
| 95-th percentile | 36.2 |
| Maximum | 47 |
| Range | 56.4 |
| Interquartile range (IQR) | 11.9 |
Descriptive statistics
| Standard deviation | 8.194188555 |
|---|---|
| Coefficient of variation (CV) | 0.3471079945 |
| Kurtosis | -0.361028029 |
| Mean | 23.60702918 |
| Median Absolute Deviation (MAD) | 5.9 |
| Skewness | -0.2756116589 |
| Sum | 15924003.5 |
| Variance | 67.14472607 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 28 | 3730 | 0.6% |
| 30 | 3623 | 0.5% |
| 29 | 3583 | 0.5% |
| 27 | 3469 | 0.5% |
| 27.7 | 3462 | 0.5% |
| 28.3 | 3459 | 0.5% |
| 28.4 | 3433 | 0.5% |
| 26 | 3411 | 0.5% |
| 28.2 | 3404 | 0.5% |
| 28.6 | 3392 | 0.5% |
| Other values (533) | 639579 | 94.8% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -9.4 | 2 | < 0.1% |
| -9.3 | 2 | < 0.1% |
| -8.8 | 3 | < 0.1% |
| -8.5 | 1 | < 0.1% |
| -8.1 | 1 | < 0.1% |
| -7.5 | 1 | < 0.1% |
| -7.3 | 1 | < 0.1% |
| -7.1 | 2 | < 0.1% |
| -6.9 | 1 | < 0.1% |
| -6.8 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 47 | 1 | < 0.1% |
| 46.8 | 1 | < 0.1% |
| 46.4 | 3 | < 0.1% |
| 46.2 | 1 | < 0.1% |
| 46.1 | 1 | < 0.1% |
| 46 | 4 | < 0.1% |
| 45.9 | 6 | < 0.1% |
| 45.8 | 4 | < 0.1% |
| 45.7 | 4 | < 0.1% |
| 45.6 | 4 | < 0.1% |
Temperatura_Aparente
| Distinct | 649 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 25.16692659 |
| Minimum | -18.2 |
|---|---|
| Maximum | 55 |
| Zeros | 103 |
| Zeros (%) | < 0.1% |
| Negative | 2621 |
| Negative (%) | 0.4% |
| Memory size | 5.1 MiB |
Quantile statistics
| Minimum | -18.2 |
|---|---|
| 5-th percentile | 10 |
| Q1 | 17.2 |
| median | 25.2 |
| Q3 | 32.3 |
| 95-th percentile | 42.4 |
| Maximum | 55 |
| Range | 73.2 |
| Interquartile range (IQR) | 15.1 |
Descriptive statistics
| Standard deviation | 10.20227767 |
|---|---|
| Coefficient of variation (CV) | 0.4053843296 |
| Kurtosis | -0.5596479849 |
| Mean | 25.16692659 |
| Median Absolute Deviation (MAD) | 7.6 |
| Skewness | 0.01927843674 |
| Sum | 16976224.5 |
| Variance | 104.0864696 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 30.2 | 3001 | 0.4% |
| 30.1 | 2922 | 0.4% |
| 31 | 2882 | 0.4% |
| 30.4 | 2877 | 0.4% |
| 30.6 | 2865 | 0.4% |
| 29.9 | 2863 | 0.4% |
| 30.3 | 2857 | 0.4% |
| 29.7 | 2829 | 0.4% |
| 30.8 | 2819 | 0.4% |
| 31.1 | 2810 | 0.4% |
| Other values (639) | 645820 | 95.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -18.2 | 1 | < 0.1% |
| -17.9 | 1 | < 0.1% |
| -17.6 | 1 | < 0.1% |
| -16.8 | 1 | < 0.1% |
| -16.6 | 1 | < 0.1% |
| -14.9 | 1 | < 0.1% |
| -14.2 | 2 | < 0.1% |
| -14 | 2 | < 0.1% |
| -13.5 | 1 | < 0.1% |
| -12.5 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 55 | 1 | < 0.1% |
| 54.6 | 1 | < 0.1% |
| 54.1 | 1 | < 0.1% |
| 54 | 1 | < 0.1% |
| 53.7 | 1 | < 0.1% |
| 53.6 | 1 | < 0.1% |
| 53.5 | 1 | < 0.1% |
| 53.2 | 2 | < 0.1% |
| 53.1 | 2 | < 0.1% |
| 53 | 5 | < 0.1% |
Precipitacion
| Distinct | 714 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.03071824711 |
| Minimum | 0 |
|---|---|
| Maximum | 82 |
| Zeros | 639969 |
| Zeros (%) | 94.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0.05 |
| Maximum | 82 |
| Range | 82 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.3194060635 |
|---|---|
| Coefficient of variation (CV) | 10.3979261 |
| Kurtosis | 13948.9958 |
| Mean | 0.03071824711 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 70.31982767 |
| Sum | 20720.84 |
| Variance | 0.1020202334 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 639969 | 94.9% |
| 0.06 | 1920 | 0.3% |
| 0.07 | 1717 | 0.3% |
| 0.08 | 1523 | 0.2% |
| 0.09 | 1390 | 0.2% |
| 0.05 | 1147 | 0.2% |
| 0.1 | 1136 | 0.2% |
| 0.11 | 1009 | 0.1% |
| 0.12 | 970 | 0.1% |
| 0.13 | 878 | 0.1% |
| Other values (704) | 22886 | 3.4% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 639969 | 94.9% |
| 0.01 | 74 | < 0.1% |
| 0.02 | 121 | < 0.1% |
| 0.03 | 101 | < 0.1% |
| 0.04 | 95 | < 0.1% |
| 0.05 | 1147 | 0.2% |
| 0.06 | 1920 | 0.3% |
| 0.07 | 1717 | 0.3% |
| 0.08 | 1523 | 0.2% |
| 0.09 | 1390 | 0.2% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 82 | 2 | < 0.1% |
| 41 | 1 | < 0.1% |
| 28.71 | 1 | < 0.1% |
| 28.34 | 1 | < 0.1% |
| 23.39 | 1 | < 0.1% |
| 22.94 | 1 | < 0.1% |
| 22.57 | 1 | < 0.1% |
| 19.32 | 1 | < 0.1% |
| 18.37 | 1 | < 0.1% |
| 17.79 | 1 | < 0.1% |
Humedad
| Distinct | 991 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 50.25697337 |
| Minimum | 1 |
|---|---|
| Maximum | 100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.1 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 13.7 |
| Q1 | 30.4 |
| median | 49.6 |
| Q3 | 69.8 |
| 95-th percentile | 88.7 |
| Maximum | 100 |
| Range | 99 |
| Interquartile range (IQR) | 39.4 |
Descriptive statistics
| Standard deviation | 23.7512555 |
|---|---|
| Coefficient of variation (CV) | 0.4725962171 |
| Kurtosis | -1.037861699 |
| Mean | 50.25697337 |
| Median Absolute Deviation (MAD) | 19.7 |
| Skewness | 0.08097398452 |
| Sum | 33900590.1 |
| Variance | 564.1221377 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 100 | 1801 | 0.3% |
| 27.9 | 983 | 0.1% |
| 34.2 | 978 | 0.1% |
| 46.5 | 970 | 0.1% |
| 32.4 | 956 | 0.1% |
| 33.1 | 946 | 0.1% |
| 35.2 | 940 | 0.1% |
| 33.4 | 939 | 0.1% |
| 28.7 | 939 | 0.1% |
| 28.8 | 939 | 0.1% |
| Other values (981) | 664154 | 98.5% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 3 | < 0.1% |
| 1.1 | 3 | < 0.1% |
| 1.2 | 3 | < 0.1% |
| 1.3 | 7 | < 0.1% |
| 1.4 | 7 | < 0.1% |
| 1.5 | 2 | < 0.1% |
| 1.6 | 3 | < 0.1% |
| 1.7 | 8 | < 0.1% |
| 1.8 | 7 | < 0.1% |
| 1.9 | 6 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 100 | 1801 | 0.3% |
| 99.9 | 108 | < 0.1% |
| 99.8 | 82 | < 0.1% |
| 99.7 | 110 | < 0.1% |
| 99.6 | 91 | < 0.1% |
| 99.5 | 129 | < 0.1% |
| 99.4 | 113 | < 0.1% |
| 99.3 | 136 | < 0.1% |
| 99.2 | 108 | < 0.1% |
| 99.1 | 129 | < 0.1% |
Velocidad_Viento
| Distinct | 236 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.641592036 |
| Minimum | 0 |
|---|---|
| Maximum | 97.6 |
| Zeros | 12368 |
| Zeros (%) | 1.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.5 |
| Q1 | 1.5 |
| median | 2.3 |
| Q3 | 3.5 |
| 95-th percentile | 6 |
| Maximum | 97.6 |
| Range | 97.6 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.742514137 |
|---|---|
| Coefficient of variation (CV) | 0.6596454385 |
| Kurtosis | 68.36427227 |
| Mean | 2.641592036 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 2.869468585 |
| Sum | 1781872.7 |
| Variance | 3.036355518 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 28253 | 4.2% |
| 2 | 26881 | 4.0% |
| 1.7 | 21081 | 3.1% |
| 1.8 | 20722 | 3.1% |
| 1.9 | 20552 | 3.0% |
| 2.2 | 19561 | 2.9% |
| 2.1 | 19523 | 2.9% |
| 1.6 | 19307 | 2.9% |
| 2.3 | 19257 | 2.9% |
| 1.5 | 18419 | 2.7% |
| Other values (226) | 460989 | 68.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 12368 | 1.8% |
| 0.1 | 3546 | 0.5% |
| 0.2 | 4157 | 0.6% |
| 0.3 | 5082 | 0.8% |
| 0.4 | 5502 | 0.8% |
| 0.5 | 6513 | 1.0% |
| 0.6 | 7534 | 1.1% |
| 0.7 | 8896 | 1.3% |
| 0.8 | 10142 | 1.5% |
| 0.9 | 11316 | 1.7% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 97.6 | 1 | < 0.1% |
| 82.3 | 1 | < 0.1% |
| 82 | 1 | < 0.1% |
| 80 | 2 | < 0.1% |
| 74 | 1 | < 0.1% |
| 69.7 | 1 | < 0.1% |
| 66.1 | 1 | < 0.1% |
| 65.2 | 1 | < 0.1% |
| 65 | 1 | < 0.1% |
| 61.9 | 1 | < 0.1% |
Radiacion_Solar
| Distinct | 10898 |
|---|---|
| Distinct (%) | 1.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 247.8187901 |
| Minimum | 0 |
|---|---|
| Maximum | 1104 |
| Zeros | 334764 |
| Zeros (%) | 49.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 1.8 |
| Q3 | 509.4 |
| 95-th percentile | 898.3 |
| Maximum | 1104 |
| Range | 1104 |
| Interquartile range (IQR) | 509.4 |
Descriptive statistics
| Standard deviation | 325.4899408 |
|---|---|
| Coefficient of variation (CV) | 1.313419134 |
| Kurtosis | -0.5636332541 |
| Mean | 247.8187901 |
| Median Absolute Deviation (MAD) | 1.8 |
| Skewness | 0.9544236694 |
| Sum | 167164925.8 |
| Variance | 105943.7015 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 334764 | 49.6% |
| 0.2 | 182 | < 0.1% |
| 0.1 | 179 | < 0.1% |
| 0.4 | 173 | < 0.1% |
| 0.3 | 163 | < 0.1% |
| 1 | 158 | < 0.1% |
| 0.7 | 157 | < 0.1% |
| 0.5 | 153 | < 0.1% |
| 0.6 | 151 | < 0.1% |
| 1.1 | 149 | < 0.1% |
| Other values (10888) | 338316 | 50.2% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 334764 | 49.6% |
| 0.1 | 179 | < 0.1% |
| 0.2 | 182 | < 0.1% |
| 0.3 | 163 | < 0.1% |
| 0.4 | 173 | < 0.1% |
| 0.5 | 153 | < 0.1% |
| 0.6 | 151 | < 0.1% |
| 0.7 | 157 | < 0.1% |
| 0.8 | 140 | < 0.1% |
| 0.9 | 134 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1104 | 1 | < 0.1% |
| 1103.9 | 1 | < 0.1% |
| 1103.8 | 1 | < 0.1% |
| 1103.2 | 1 | < 0.1% |
| 1101.5 | 1 | < 0.1% |
| 1100.8 | 1 | < 0.1% |
| 1100.7 | 1 | < 0.1% |
| 1100.2 | 1 | < 0.1% |
| 1099.5 | 1 | < 0.1% |
| 1099.3 | 1 | < 0.1% |
Nubosidad
| Distinct | 1001 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.238278543 |
| Minimum | 0 |
|---|---|
| Maximum | 100 |
| Zeros | 524204 |
| Zeros (%) | 77.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.1 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 20.8 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 11.82839709 |
|---|---|
| Coefficient of variation (CV) | 3.652680563 |
| Kurtosis | 29.64148825 |
| Mean | 3.238278543 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 5.152592835 |
| Sum | 2184364.6 |
| Variance | 139.9109778 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 524204 | 77.7% |
| 0.1 | 10073 | 1.5% |
| 0.2 | 5925 | 0.9% |
| 0.3 | 4884 | 0.7% |
| 0.4 | 4037 | 0.6% |
| 0.5 | 3240 | 0.5% |
| 0.6 | 2721 | 0.4% |
| 0.7 | 2623 | 0.4% |
| 0.8 | 2345 | 0.3% |
| 0.9 | 2068 | 0.3% |
| Other values (991) | 112425 | 16.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 524204 | 77.7% |
| 0.1 | 10073 | 1.5% |
| 0.2 | 5925 | 0.9% |
| 0.3 | 4884 | 0.7% |
| 0.4 | 4037 | 0.6% |
| 0.5 | 3240 | 0.5% |
| 0.6 | 2721 | 0.4% |
| 0.7 | 2623 | 0.4% |
| 0.8 | 2345 | 0.3% |
| 0.9 | 2068 | 0.3% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 100 | 363 | 0.1% |
| 99.9 | 73 | < 0.1% |
| 99.8 | 56 | < 0.1% |
| 99.7 | 44 | < 0.1% |
| 99.6 | 44 | < 0.1% |
| 99.5 | 32 | < 0.1% |
| 99.4 | 29 | < 0.1% |
| 99.3 | 35 | < 0.1% |
| 99.2 | 36 | < 0.1% |
| 99.1 | 33 | < 0.1% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
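The definition above can be sketched in plain Python: rank both variables (tied values share their average rank), then divide the covariance of the ranks by the product of their standard deviations. The helper names are illustrative and not part of the profiling tool; the sample values are Hora and Temperatura from the first ten rows of the dataset.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the ranks divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry)) / n
    sx = (sum((a - mx) ** 2 for a in rx) / n) ** 0.5
    sy = (sum((b - my) ** 2 for b in ry) / n) ** 0.5
    return cov / (sx * sy)


# Hora and Temperatura from the first ten rows of the dataset.
hora = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
temperatura = [14.7, 14.3, 12.5, 12.1, 10.4, 9.7, 9.1, 8.9, 8.7, 9.1]
print(spearman_rho(hora, temperatura))
```

Ten overnight rows are far too few to reproduce the report's correlation matrix; the sample only shows the mechanics, including how the tied 9.1 readings share a rank.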
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
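As a minimal illustration of the formula and of the invariance property, the sketch below computes r for Temperatura and Humedad from the first ten rows of the dataset, then again after an arbitrary positive rescaling and shift of the temperatures. The function name `pearson_r` and the rescaling constants are illustrative choices.

```python
def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = (sum((a - mx) ** 2 for a in x) / n) ** 0.5
    sy = (sum((b - my) ** 2 for b in y) / n) ** 0.5
    return cov / (sx * sy)


# Temperatura and Humedad from the first ten rows of the dataset.
temperatura = [14.7, 14.3, 12.5, 12.1, 10.4, 9.7, 9.1, 8.9, 8.7, 9.1]
humedad = [58.9, 61.5, 64.8, 68.8, 73.0, 71.9, 77.8, 79.3, 79.0, 72.6]

r_original = pearson_r(temperatura, humedad)

# Change of scale (x 1.8) and location (+ 32); the constants are arbitrary.
rescaled = [1.8 * t + 32 for t in temperatura]
r_rescaled = pearson_r(rescaled, humedad)

print(r_original, r_rescaled)
```

Both calls print the same value up to floating-point rounding. Note that the invariance holds for positive scale factors; multiplying one variable by a negative constant flips the sign of r.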
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
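The pair-counting rule above can be written directly as a short sketch. This is the simple form described in the text (often called tau-a), in which tied pairs count as neither concordant nor discordant; library implementations commonly apply a tie correction, so their values can differ when ties are present. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total pairs; tied pairs count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Hora and Temperatura from the first ten rows of the dataset.
hora = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
temperatura = [14.7, 14.3, 12.5, 12.1, 10.4, 9.7, 9.1, 8.9, 8.7, 9.1]
print(kendall_tau(hora, temperatura))
```

The double loop is quadratic in the number of observations, which is fine for a ten-row illustration but would be slow on the full 674545 rows.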
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution.
Missing values
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
First rows
| | Fecha | Zona_Carga | Hora | Temperatura | Temperatura_Aparente | Precipitacion | Humedad | Velocidad_Viento | Radiacion_Solar | Nubosidad |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2010-01-01 | Obregon | 0 | 14.7 | 13.7 | 0.0 | 58.9 | 2.9 | 0.0 | 0.0 |
| 1 | 2010-01-01 | Obregon | 1 | 14.3 | 13.6 | 0.0 | 61.5 | 2.6 | 0.0 | 0.0 |
| 2 | 2010-01-01 | Obregon | 2 | 12.5 | 12.1 | 0.0 | 64.8 | 2.5 | 0.0 | 0.0 |
| 3 | 2010-01-01 | Obregon | 3 | 12.1 | 11.8 | 0.0 | 68.8 | 2.3 | 0.0 | 0.0 |
| 4 | 2010-01-01 | Obregon | 4 | 10.4 | 10.4 | 0.0 | 73.0 | 2.1 | 0.0 | 0.0 |
| 5 | 2010-01-01 | Obregon | 5 | 9.7 | 8.9 | 0.0 | 71.9 | 2.0 | 0.0 | 0.0 |
| 6 | 2010-01-01 | Obregon | 6 | 9.1 | 8.2 | 0.0 | 77.8 | 2.0 | 0.0 | 0.0 |
| 7 | 2010-01-01 | Obregon | 7 | 8.9 | 7.9 | 0.0 | 79.3 | 2.0 | 0.0 | 0.0 |
| 8 | 2010-01-01 | Obregon | 8 | 8.7 | 7.6 | 0.0 | 79.0 | 2.0 | 0.0 | 0.0 |
| 9 | 2010-01-01 | Obregon | 9 | 9.1 | 9.4 | 0.0 | 72.6 | 2.0 | 120.8 | 0.0 |
Last rows
| | Fecha | Zona_Carga | Hora | Temperatura | Temperatura_Aparente | Precipitacion | Humedad | Velocidad_Viento | Radiacion_Solar | Nubosidad |
|---|---|---|---|---|---|---|---|---|---|---|
| 674535 | 2022-10-30 | Hermosillo | 14 | 28.9 | 30.4 | 0.0 | 15.0 | 0.9 | 682.8 | 0.0 |
| 674536 | 2022-10-30 | Hermosillo | 15 | 29.0 | 29.9 | 0.0 | 13.4 | 1.6 | 586.8 | 0.0 |
| 674537 | 2022-10-30 | Hermosillo | 16 | 29.8 | 28.5 | 0.0 | 12.7 | 3.6 | 448.0 | 0.0 |
| 674538 | 2022-10-30 | Hermosillo | 17 | 29.0 | 26.4 | 0.0 | 12.5 | 2.4 | 275.4 | 0.0 |
| 674539 | 2022-10-30 | Hermosillo | 18 | 27.0 | 24.2 | 0.0 | 17.2 | 2.1 | 125.7 | 0.0 |
| 674540 | 2022-10-30 | Hermosillo | 19 | 24.9 | 21.5 | 0.0 | 22.8 | 2.0 | 7.8 | 0.0 |
| 674541 | 2022-10-30 | Hermosillo | 20 | 22.0 | 19.6 | 0.0 | 33.0 | 1.8 | 0.0 | 0.0 |
| 674542 | 2022-10-30 | Hermosillo | 21 | 19.1 | 18.5 | 0.0 | 39.8 | 1.7 | 0.0 | 0.0 |
| 674543 | 2022-10-30 | Hermosillo | 22 | 18.1 | 17.9 | 0.0 | 41.8 | 0.3 | 0.0 | 0.0 |
| 674544 | 2022-10-30 | Hermosillo | 23 | 18.1 | 17.2 | 0.0 | 46.0 | 0.9 | 0.0 | 0.0 |